feat(xlang): support serialization for unsigned types and field encoding config #3113
Merged: chaokunyang merged 45 commits into apache:main from chaokunyang:support_unsigned_types_for_java on Jan 10, 2026
pandalee99 approved these changes on Jan 7, 2026
- Add missing Apache license header to DispatchId.java
- Fix ClassCastException in DefaultValueUtils.setDefaultValues by using the Number interface for type conversion instead of direct casts
…rackingRef is false
When global ref tracking is enabled, serializers call reference() at the end of deserialization. If a field has trackingRef=false (e.g., in xlang mode where all fields default to trackingRef=false), we need to push a stub -1 via preserveRefId() so that reference() can pop it and skip setReadObject.
The fix checks if the TYPE normally needs ref tracking (ignoring field-level metadata) by using TypeRef.of(typeRef.getRawType()). This ensures the stub is pushed when needed, preventing ArrayIndexOutOfBoundsException when the serializer calls reference() on an empty readRefIds stack.
Use Types.getTypeId() instead of ClassResolver registered IDs for determining dispatch IDs in DefaultValueUtils. This ensures consistent type IDs between DispatchId constants and the values used in setDefaultValues. Also convert default values to correct types during initialization to avoid repeated type conversion at runtime.
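The ClassCastException fix mentioned in the commit notes above can be illustrated in isolation. The sketch below is not the actual DefaultValueUtils code; the class and method names (NumberCastSketch, asLong, directCastFails) are hypothetical and only show why converting through Number is safer than a direct cast when the boxed type of a default value is not known in advance.

```java
final class NumberCastSketch {
    // Hypothetical helper: widen any boxed numeric default to long by going
    // through the Number interface, which every boxed numeric type implements.
    static long asLong(Object boxedDefault) {
        return ((Number) boxedDefault).longValue();
    }

    // The failure mode: a direct cast assumes the runtime type is exactly Long.
    static boolean directCastFails(Object boxedDefault) {
        try {
            Long ignored = (Long) boxedDefault;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object boxed = Integer.valueOf(7); // boxed as Integer, but a long is wanted
        System.out.println(directCastFails(boxed)); // true
        System.out.println(asLong(boxed));          // 7
    }
}
```

Doing this conversion once, when the defaults are first computed, matches the stated goal of avoiding repeated type conversion at runtime.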
Why?
Java doesn't have native unsigned integer types, but many other languages (Rust, Go, C++, Python with ctypes) do. When serializing data across languages, we need to properly handle unsigned integers to ensure correct values and efficient encoding.
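To make both concerns concrete (correct values and efficient encoding), here is a minimal sketch that uses only the JDK, not Fory's API. The class and method names (UnsignedSketch, u32ToLong, longToU32, writeVarUint32, readVarUint32) are hypothetical, and the varint shown is a generic LEB128-style encoding chosen for illustration; the wire format Fory actually uses may differ.

```java
import java.io.ByteArrayOutputStream;

final class UnsignedSketch {
    // Reinterpret the 32 bits of a Java int as an unsigned value, widened to long.
    static long u32ToLong(int bits) {
        return Integer.toUnsignedLong(bits);
    }

    // Narrow a value in [0, 2^32 - 1] to the int that carries the same 32 bits.
    static int longToU32(long value) {
        if (value < 0 || value > 0xFFFF_FFFFL) {
            throw new IllegalArgumentException("out of u32 range: " + value);
        }
        return (int) value;
    }

    // Generic unsigned varint (an assumption, not Fory's documented format):
    // 7 payload bits per byte, high bit set while more bytes follow.
    static byte[] writeVarUint32(int bits) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(5);
        int v = bits;
        while ((v & ~0x7F) != 0) {
            out.write((v & 0x7F) | 0x80);
            v >>>= 7; // unsigned shift, so values above Integer.MAX_VALUE terminate
        }
        out.write(v);
        return out.toByteArray();
    }

    // Inverse of writeVarUint32; returns the raw 32 bits as a Java int.
    static int readVarUint32(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break;
            }
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        int bits = longToU32(3_000_000_000L);
        System.out.println(bits);                           // -1294967296 when read as signed
        System.out.println(u32ToLong(bits));                // 3000000000 when read as unsigned
        System.out.println(writeVarUint32(300).length);     // 2 bytes instead of a fixed 4
        System.out.println(writeVarUint32(bits).length);    // 5 bytes for a large value
    }
}
```

The trade-off is visible in the last two lines: a varint saves space for small values but costs an extra byte for large ones, which is one reason a per-field choice between fixed and variable-length encoding is useful.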
For example:
- u32 with value 3_000_000_000 cannot be directly represented in Java's signed int (max ~2.1 billion)
What does this PR do?
1. Adds Unsigned Integer Type Support (All Languages)
- UINT8 (9)
- UINT16 (10)
- UINT32 (11)
- VAR_UINT32 (12)
- UINT64 (13)
- VAR_UINT64 (14)
- TAGGED_UINT64 (15)
2. Renames Type Constants for Clarity (All Languages)
- VAR32 → VARINT32
- VAR64 → VARINT64
- H64 → TAGGED_INT64
- VARU32 → VAR_UINT32
- VARU64 → VAR_UINT64
- HU64 → TAGGED_UINT64
3. Java: Adds Type Annotations for Field-Level Control
New annotations allow specifying exact encoding at field level:
- @Uint8Type - Mark field as unsigned 8-bit [0, 255]
- @Uint16Type - Mark field as unsigned 16-bit [0, 65535]
- @Uint32Type(compress=true/false) - Unsigned 32-bit with optional varint encoding
- @Uint64Type(encoding=FIXED/VARINT/TAGGED) - Unsigned 64-bit with encoding options
- @Int32Type(compress=true/false) - Signed 32-bit with optional varint encoding
- @Int64Type(encoding=FIXED/VARINT/TAGGED) - Signed 64-bit with encoding options
4. C++: Adds FORY_FIELD_CONFIG Macro for Field Encoding Control
5. Rust: Extends #[fory(...)] Derive Macro with Encoding Attributes
6. Go: Extends Struct Tags with compress and encoding Options
Options:
- compress=true/false: For int32/uint32, controls varint vs fixed encoding
- encoding=varint/fixed/tagged: For all numeric types, explicitly sets encoding
7. Python: Adds Type Hints for Encoding Control
8. Java Internal Changes
- DispatchId class handles type dispatching in code generation
Related issues
Closes #3110
Closes #2914
#3099
#1017
#2906
#2982
Does this PR introduce any user-facing change?
Does this PR introduce any public API change?
- Java annotations: @Uint8Type, @Uint16Type, @Uint32Type, @Uint64Type, @Int32Type, @Int64Type
- C++: FORY_FIELD_CONFIG macro for encoding configuration
- Rust: compress and encoding attributes in #[fory(...)] derive macro
- Go: compress and encoding options in struct tags
- Python: type hints (fixed_int32, tagged_int64, uint32, etc.)
- Renamed type constants (VAR32 → VARINT32)
Does this PR introduce any binary protocol compatibility change?
Benchmark
N/A - This PR focuses on correctness and cross-language compatibility. Performance characteristics of unsigned types are similar to their signed counterparts.